A Strategy for Genome-Wide Identification of Gene Based Polymorphisms in Rice Reveals Non-Synonymous Variation and Functional Genotypic Markers
Abstract
Genetic diversity in plants has traditionally been used to adapt crops to human needs, and it will be needed to feed a growing population and to protect crops from environmental stresses and climate change. Genome-wide sequencing is now a reality and can be used to associate sequence variation with crop traits for use in high-throughput marker-based selection. This study describes a strategy that uses next-generation sequencing (NGS) data from the rice genome, compares it with the high-quality reference genome, and identifies polymorphisms within genes that might alter function; these polymorphisms can be used to study correlations with traits and in genetic mapping. We analyzed NGS data of Oryza sativa ssp. indica cv. G4, covering 241 Mb at ∼20X coverage, and compared it with the reference genome of Oryza sativa ssp. japonica to describe the genome-wide distribution of gene-based single nucleotide polymorphisms (SNPs). The analysis shows that the covered 63% of the genome contains 1.6 million SNPs (6.9 SNPs/kb), along with 80,146 insertions and 92,655 deletions (INDELs) genome-wide. In total, 1,139,801 SNPs are intergenic, 295,136 lie in intronic/non-coding regions, 195,098 in coding regions, 23,242 in five-prime (5') UTRs and 22,686 in three-prime (3') UTRs. SNP variation was found in 40,761 gene loci, comprising 75,262 synonymous and 119,836 non-synonymous changes, and functional reading-frame changes through 3,886 STOP-codon-inducing SNPs (isSNPs) and 729 STOP-codon-preventing SNPs (psSNPs). There are 194 rapidly evolving SNP-hotspot genes (>100 SNPs/gene), and 1,513 of 2,458 transcription factors carry 2,294 non-synonymous SNPs, which could be a major source of phenotypic diversity within the species. All data are searchable at https://plantstress-pereira.uark.edu/oryza2.
We envision that this strategy will be useful for the identification of genes for crop traits and molecular breeding of rice cultivars.
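To make the coding-SNP categories in the abstract concrete, the following is a minimal sketch of how a single-base change inside a codon could be labelled as synonymous, non-synonymous, STOP-inducing (isSNP) or STOP-preventing (psSNP). It assumes the standard genetic code and a known reading frame; the function name `classify_snp` and its arguments are hypothetical and are not taken from the study's pipeline. Reporting isSNP and psSNP as labels separate from "non-synonymous" is also an assumption, since the abstract does not say how the counts overlap.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = STOP).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, pos: int, alt_base: str) -> str:
    """Label a single-base substitution within one codon.

    ref_codon: reference codon on the coding strand (DNA alphabet).
    pos:       0-based position of the SNP within the codon (0, 1 or 2).
    alt_base:  alternative base observed at that position.
    Hypothetical helper for illustration only.
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:pos] + alt_base.upper() + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"      # same amino acid (or STOP stays STOP)
    if alt_aa == "*":
        return "isSNP"           # change introduces a STOP codon
    if ref_aa == "*":
        return "psSNP"           # change removes a STOP codon
    return "non-synonymous"      # amino acid substitution


if __name__ == "__main__":
    for ref, pos, alt in [("GAA", 2, "G"), ("GAA", 0, "A"),
                          ("TAC", 2, "A"), ("TGA", 2, "G")]:
        print(ref, pos, alt, classify_snp(ref, pos, alt))
```

In a real analysis the codon and position would be derived from the gene annotation and strand of each SNP; that step is omitted here.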
Related resources
Genome-wide DNA polymorphisms in low Phosphate tolerant and sensitive rice genotypes
Soil phosphorus (P) deficiency is one of the major challenges to rice crops worldwide. Modern rice genotypes are highly P-responsive and rely on high inputs of P fertilizer. However, low-P-tolerant traditional cultivars and landraces have the genetic potential to perform well under low P. Identification of high resolution DNA polymorphisms (SNPs and InDels) in such contrasting genotypes is largely ...
Genetic diversity analysis of recombinant inbred lines of rice (Oryza sativa L.) using microsatellite markers
Estimation of genetic diversity is an important factor in germplasm conservation and characterization. In rice breeding programs, genetic diversity information on specific regions of the genome can be very useful for the application of marker-assisted selection (MAS) and for gene mapping. A total of 152 rice lines were considered for breeding programs using the microsatellite (SSR) technique. The tota...
Broadening Gene Pool of Rice for Resistance to Biotic Stresses Through Wide Hybridization
Variability in the cultivated germplasm for economic traits such as resistance to rice tungro virus, sheath blight and yellow stem borer, and tolerance to drought and salt, is limited. This necessitated a search for genes in the secondary and tertiary gene pools of the genus Oryza. Fortunately, wild species are an important reservoir of useful genes for resistance to major disease, pest and tolerance t...
Identification of Synonymous Codon Usage Bias in the Pseudorabies Virus UL31 Gene
Background: Little is known about the synonymous codon usage pattern of the pseudorabies virus (PRV) genome, especially that of the UL31 gene, during its evolution. Objectives: In the present study, the codon usage bias between the PRV UL31 sequence and UL31-like sequences was identified. Materials and Methods: We used a comprehensive analysi...
In-silico study to identify the pathogenic single nucleotide polymorphisms in the coding region of CDKN2A gene
Background: CDKN2A is a tumor suppressor gene that encodes two important tumor suppressor proteins, p16 and p14. Mutations in this gene, and the resulting defects in the p16 and p14 proteins, lead to the downregulation of RB1/p53 and cancer malignancy. Various powerful bioinformatics tools are available to identify the structural and functional effects of mutations. The aim of this study is the ident...